Unsupervised Neural Hidden Markov Models
In this work, we present the first results for neuralizing an Unsupervised
Hidden Markov Model. We evaluate our approach on tag induction. Our approach
outperforms existing generative models and is competitive with the
state-of-the-art though with a simpler model easily extended to include
additional context. Comment: accepted at EMNLP 2016, Workshop on Structured Prediction for NLP.
Oral presentation
Metabolomics in conjunction with computational methods for supporting biomedical research: to improve functional resilience in age-related disorders
Metabolomics and lipidomics lay the foundation of personalized medicine. The technological advancements in mass spectrometry techniques in combination with computational algorithms and methods have enabled the study of small molecules (metabolites and lipids) for understanding the disease state and biological pathways, the identification of biomarkers and the generation of predictive models for patient management, as well as the identification of natural products as new leads in drug discovery research. The computational methods utilize large, complex datasets to gather insights about underlying biological processes, trends, and non-random patterns.
This dissertation focuses on research studies in which the integration of metabolomics with computational methods enabled the discovery of active natural products as leads to combat Alzheimer's disease, the use of metabolomics and lipidomics workflows to analyze heart tissues stored in optimal cutting temperature compound by mass spectrometry, and the assessment of the effect of doxycycline on biochemical pathways associated with breast cancer.
The methodological pipeline and the associated workflows and technologies are described in Chapter 2. The optimal design of the pre-analytical workflows and their integration with the appropriate measurement technologies are important for successful metabolomics and lipidomics studies. In this thesis, pre-analytical workflows were developed and applied for sample extraction procedures to separate metabolites and lipids from tissues, cells and botanical extracts, followed by chromatographic separation with ultra-performance liquid chromatography (UPLC). The metabolite and lipid profiles were detected using high-resolution mass spectrometry in conjunction with tandem mass spectrometry. For characterization of isomers, travelling wave ion mobility mass spectrometry was utilized. Metabolomics and lipidomics approaches were enhanced by computational methods for data processing, data visualization and interpretation.
In this thesis, we developed LC-MS metabolomics approaches for the characterization of botanical extracts and applied and evaluated innovative bio-chemometrics approaches to assign the bioactive principles. The natural products research in the thesis focused on Centella asiatica, a botanical that has gained popularity for its potential to enhance cognitive function and brain vitality in aging. An important contribution toward effective clinical trials is the availability of standardized Centella asiatica extracts, characterized with an LC-MS/MS workflow, to facilitate reproducible use and to account for the substantial variability across natural products. A secondary goal in this thesis was method development aimed at reducing reliance on bioactivity-guided fractionation by combining flow injection mass spectrometry with innovative computational methods that allow rapid dereplication of natural products and assignment of bioactive natural products. The methodological pipeline, in conjunction with the applied computational approaches, will lead to a decrease in the time needed to move bioactive natural products to preclinical testing.
In another study, methodology was evaluated to allow the use of bio-banked heart tissue samples for subsequent biomarker discovery research. The research on optimal cutting temperature (OCT) embedded heart tissue was designed to determine the compatibility of OCT storage with UPLC-MS/MS lipidomics studies. The results show that OCT-stored heart tissue is compatible with LC-MS/MS lipidomics, facilitating the use of bio-banked tissue samples for future studies. The critical evaluation of the developed workflow shows that LC-MS/MS lipidomics of OCT-banked tissue samples is reliable for the major lipid classes except for plasmalogens, which would likely be underestimated using the described protocol.
The last research chapter of this thesis outlines studies designed to determine whether a doxycycline (DOX)-dependent gene expression knockdown system is a viable strategy for studying the biological effects of targeted gene silencing in metabolomic studies of breast cancer cells. This research utilized a workflow comprising a combination of NMR and mass spectrometry. NMR was utilized to identify polar metabolites. Hydrophilic interaction liquid chromatography was used in conjunction with MS/MS mass spectrometry to determine the effect of doxycycline on metabolites. Reversed-phase ultra-performance liquid chromatography was utilized along with MSE mass spectrometry to assess the impact of doxycycline on lipids. The research indicated that DOX-based gene expression knockdown strategies unexpectedly affected metabolic pathways in the breast cancer cell lines. This serves as a cautionary tale for the use of doxycycline in gene silencing in metabolomics and lipidomics experiments.
The conclusion of the thesis provides a summary of the insights obtained by using computational methods in metabolomics. It provides perspectives on the future of patient management, the discovery of compounds with potential for the treatment of diseases, and obtaining an in-depth understanding of disease states using mass spectrometry-based metabolomics and lipidomics.
DeepInf: Social Influence Prediction with Deep Learning
Social and information networking activities such as on Facebook, Twitter,
WeChat, and Weibo have become an indispensable part of our everyday life, where
we can easily access friends' behaviors and are in turn influenced by them.
Consequently, an effective social influence prediction for each user is
critical for a variety of applications such as online recommendation and
advertising.
Conventional social influence prediction approaches typically design various
hand-crafted rules to extract user- and network-specific features. However,
their effectiveness heavily relies on the knowledge of domain experts. As a
result, it is usually difficult to generalize them into different domains.
Inspired by the recent success of deep neural networks in a wide range of
computing applications, we design an end-to-end framework, DeepInf, to learn
users' latent feature representation for predicting social influence. In
general, DeepInf takes a user's local network as the input to a graph neural
network for learning her latent social representation. We design strategies to
incorporate both network structures and user-specific features into
convolutional neural and attention networks. Extensive experiments on Open
Academic Graph, Twitter, Weibo, and Digg, representing different types of
social and information networks, demonstrate that the proposed end-to-end
model, DeepInf, significantly outperforms traditional feature engineering-based
approaches, suggesting the effectiveness of representation learning for social
applications. Comment: 10 pages, 5 figures, to appear in KDD 2018 proceedings
Query Resolution for Conversational Search with Limited Supervision
In this work we focus on multi-turn passage retrieval as a crucial component
of conversational search. One of the key challenges in multi-turn passage
retrieval comes from the fact that the current turn query is often
underspecified due to zero anaphora, topic change, or topic return. Context
from the conversational history can be used to arrive at a better expression of
the current turn query, defined as the task of query resolution. In this paper,
we model the query resolution task as a binary term classification problem: for
each term appearing in the previous turns of the conversation decide whether to
add it to the current turn query or not. We propose QuReTeC (Query Resolution
by Term Classification), a neural query resolution model based on bidirectional
transformers. We propose a distant supervision method to automatically generate
training data by using query-passage relevance labels. Such labels are often
readily available in a collection either as human annotations or inferred from
user interactions. We show that QuReTeC outperforms state-of-the-art models,
and furthermore, that our distant supervision method can be used to
substantially reduce the amount of human-curated data required to train
QuReTeC. We incorporate QuReTeC in a multi-turn, multi-stage passage retrieval
architecture and demonstrate its effectiveness on the TREC CAsT dataset. Comment: SIGIR 2020 full conference paper
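The term-classification framing in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`tokenize`, `distant_labels`, `resolve_query`), the whitespace tokenizer, and the labeling rule (a history term is positive if it occurs in a relevant passage and is not already in the current query) are assumptions made for illustration. In the actual model a neural classifier would predict these labels; here the distantly generated labels stand in for its output.

```python
def tokenize(text):
    """Lowercase, split on whitespace, strip basic punctuation (a simplification)."""
    stripped = (tok.strip(".,?!;:").lower() for tok in text.split())
    return [tok for tok in stripped if tok]


def distant_labels(history_turns, current_query, relevant_passage):
    """Assign a binary label to each distinct term from the previous turns.

    Assumed rule (hypothetical): label 1 if the term appears in a passage
    marked relevant to the current turn, else 0. Terms already present in
    the current query are skipped, since they need not be added.
    """
    query_terms = set(tokenize(current_query))
    passage_terms = set(tokenize(relevant_passage))
    labels = {}
    for turn in history_turns:
        for term in tokenize(turn):
            if term in query_terms:
                continue
            labels[term] = int(term in passage_terms)
    return labels


def resolve_query(current_query, labels):
    """Append every positively labeled history term to the current query."""
    added = [term for term, label in labels.items() if label == 1]
    return current_query + " " + " ".join(added) if added else current_query


history = ["What is throat cancer?"]
query = "Is it treatable?"
passage = "Throat cancer is often treatable when found early."
labels = distant_labels(history, query, passage)
print(resolve_query(query, labels))
```

Under these assumptions the underspecified query "Is it treatable?" is expanded with "throat" and "cancer" from the earlier turn, while "what" is left out because it does not occur in the relevant passage.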
Documentary Linguistics and Computational Linguistics: A response to Brooks
National Foreign Language Resource Center